Provide CAII batch embedding for better performance #35

Merged
conradocloudera merged 1 commit into main from cm/caii-embedding-batch
Dec 9, 2024

Conversation

conradocloudera
Collaborator

Currently in draft, as I'm not sure how to test this.

@jkwatson
Collaborator

jkwatson commented Dec 5, 2024

I think we should get this in, even if we haven't figured out how to test it yet. We can test it in-situ at least for starters.
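
For readers unfamiliar with the idea, batch embedding can be sketched roughly as below. This is a minimal illustration and not the code merged in this PR: `embed_in_batches`, `FakeEndpoint`, and the `batch_size` default are hypothetical names and values chosen for the example, and the fake endpoint stands in for the real CAII call. It also hints at one way the batching logic could be tested without a live endpoint, by counting how many requests are made.

```python
from typing import Callable, List, Sequence

Embedding = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Embedding]],
    batch_size: int = 16,  # hypothetical default, not taken from the PR
) -> List[Embedding]:
    """Embed texts by sending them to the endpoint in fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings: List[Embedding] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        result = embed_batch(batch)  # one request per batch, not per text
        if len(result) != len(batch):
            raise ValueError("endpoint returned a different number of embeddings")
        embeddings.extend(result)  # input order is preserved
    return embeddings


class FakeEndpoint:
    """Stand-in for a remote embedding endpoint (an assumption for testing)."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[str]) -> List[Embedding]:
        self.calls += 1
        # Toy embedding: a single number equal to the text length.
        return [[float(len(text))] for text in batch]
```

With five texts and `batch_size=2`, the fake endpoint would be called three times instead of five, which is where the performance gain in the PR title would come from.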

conradocloudera marked this pull request as ready for review December 9, 2024 17:34
conradocloudera merged commit 2dac585 into main Dec 9, 2024
1 check passed
conradocloudera deleted the cm/caii-embedding-batch branch December 9, 2024 17:34
ewilliams-cloudera added a commit that referenced this pull request Dec 12, 2024
* upgrade everything

* small refactor for params, update loading

* add bedrock converse

* fix loading

* Clean up Cohere suggested questions

* Add property-based test for process_response() (#56)

* Add hypothesis

* Add property-based test for process_response()

* Shorten variable

* Formatting

* Add type annotations

* Fix type annotation

* hacking on startup scripts

* hacking on startup scripts, moar

* fix wrong dir

* try having the java side restart itself if it dies

* see output from java startup

* add debug info

* add the executable bit

* change the flags

* Add docstrings for tests

* refactor datasourceId

* update to exclude 405b model and default to 8b

* update readme for new cohere

* fix broken tests monkeypatching

* "wip on creating with models and response chunks"

* wip on modal updates

* commit java updates

* wip on populating the chat setting modal

* set up ui for updating a session

* add update method

* use updated session for chat

* remove query configuration from the chat context

* refactoring fe and fixing bug with empty model

* remove the datasource id from the context and use the active session instead

* Update release version to 1.4.0-beta

* Support multiple embedding models (#59)

* add embedding model to the data source in the java API

* embedding model used from the datasource while indexing

* replace the rest of the embedding model defaults

* "test & fix bugs with embedding variability"

* small refactoring to make embedding & llm caii methods look the same

* fix linting issues

* add a todo for a failing property test case

* remove unused import

---------

Co-authored-by: Elijah Williams <ewilliams@cloudera.com>

* Provide CAII batch embedding for better performance (#35)

* CAII endpoint discovery (#60)

* "wip on endpoint listing"

* "wip on list_endpoints typing"

* "refactoring to endpoint object"

* "wip filtering"

* "endpoints queried!"

* "refactoring"

* "wip on cleaning up types"

* "type cleanup complete"

* "moving files"

* "use a dummy embedding model for deletes"

* fix some bits from merge, get evals working again with CAII, tests passing

* formatting

* clean up ruff stuff

* use the chat llm for evals

* fix mypy for reformatting

* "wip on java reconciler"

* "reconciler don't do no model; start python work"

* "python - updating for summarization model"

* "comment out batch embeddings to get it working again"

* add handling for no summarization in the files table

* finish up ui and python for summarization

* make sure to update the time-updated fields on data sources and chat sessions

* use no-op models when we don't need real ones for summary functionality

* Update release version to dev-testing

* use the summarization llm when summarizing summaries

---------

Co-authored-by: Elijah Williams <ewilliams@cloudera.com>
Co-authored-by: actions-user <actions@github.com>

* Update release version to 1.4.0

* pass the original filename from java-> python so we don't need s3 metadata to store it

* don't read the whole directory when summarizing docs

* "refactor java to use RagFileService"

* remove seaweedfs experiment

* Make mypy happy (#62)

* Refactor summary index to isolate the logic (#63)

* Refactor summary index to isolate the logic

* fix tests

* handle race condition

* handle mypy

* ignore errors if the directory doesn't exist

---------

Co-authored-by: jwatson <jkwatson@gmail.com>

* image

* Update catalog entry to match the official one (#66)

* Update local catalog with official info

* add the git-ref back

* add the html long description (#67)

* Shuffle API for data sources for easier human consumption (#68)

* Shuffle API for data sources for easier human consumption

* make mypy happy

* remove prints

* wip on fs rag file uploader

* "now we're thinking with overtime"

* Revert ""now we're thinking with overtime""

This reverts commit 3c93206.

* get the databases directory from the environment (in local_dev)
python file storage abstraction
python tests currently broken
real AMP startup script needs new env var

* add a todo

* merge from main

* properly override the configuration in pytest configure to point at a temp directory

* get the tests passing with filesystem file handoff

* update project metadata to support new local filesystem storage

* Update release version to dev-testing

* fix java

* cleanup after switching tests to use the local filesystem

* Remove unused settings (#70)

* remove unused dep

* fix circular dep and refactor doc storage

* Update release version to 1.4.0

* Summarize the data store on every document summarization (#69)

* fix bug with s3 path when the prefix is not provided (#72)

* add --reload to the fastapi startup_app script

* Avoid global variables and use ephemeral folder for tests (#71)

* Avoid global variables and use ephemeral folder for tests

* fix with merge to main

* Remove print

* lint

* refetch knowledge base summary on doc summary change

* Bump @eslint/plugin-kit (#16)

Bumps the npm_and_yarn group with 1 update in the /ui directory: [@eslint/plugin-kit](https://github.com/eslint/rewrite).


Updates `@eslint/plugin-kit` from 0.2.2 to 0.2.3
- [Release notes](https://github.com/eslint/rewrite/releases)
- [Changelog](https://github.com/eslint/rewrite/blob/main/release-please-config.json)
- [Commits](eslint/rewrite@plugin-kit-v0.2.2...plugin-kit-v0.2.3)

---
updated-dependencies:
- dependency-name: "@eslint/plugin-kit"
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: jwatson <jkwatson@gmail.com>
Co-authored-by: Michael Liu <mliu@cloudera.com>
Co-authored-by: actions-user <actions@github.com>
Co-authored-by: conradocloudera <csilvamiranda@cloudera.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>